Statistical Inference & Measures of Effect

Adam La Caze
School of Pharmacy
The University of Queensland

May 2021

Introduction

Objectives

  1. Understand the logic of statistical testing in clinical trials
  2. Demonstrate an understanding of key statistical concepts: hypothesis testing, \(p\) values, confidence intervals and power
  3. Be able to describe and interpret common measures of effect used in clinical epidemiology
  4. Be able to critically appraise a clinical trial

Clinical epidemiology

Bias

Any systematic error in estimating the effect of a drug, exposure or risk factor on a specified outcome.

Random error

Variation that occurs within stochastic processes, i.e. error due to chance.

Sources of error

[Figure: a map of the different types of bias that lead to inaccurate estimation]

Critical Appraisal

The task of critical appraisal is to appraise the paper and decide whether the findings are of relevance to your practice.

The simplest advice about critical appraisal is to read the paper and think about it.

Critical Appraisal Tool (Oxford Centre for Evidence-based Medicine)

  1. Determine the PICO for the study
  2. Appraise the study methods
  3. Interpret the results of the study
  4. Consider external validity

Determining the PICO for a study

| PICO | Notes |
| --- | --- |
| Participants | Who are the participants in the study? What were the inclusion/exclusion criteria? Who participated? |
| Intervention | What is the intervention under investigation? |
| Comparator | What standard treatment is the intervention being tested against? Might be placebo or active control. |
| Outcome | What is the primary endpoint/outcome of the study? |

Appraising Voysey et al. (2021): PICO and primary results

Task

Focusing on the primary endpoint of Voysey et al. (2021):

  1. Identify the PICO
  2. Identify the null hypothesis
  3. Interpret the statistical result

Based on the information available to you, what inference does the study support?

1. Determine the PICO

| PICO | Notes |
| --- | --- |
| Participants | Participants in one of four vaccine RCTs. Most participants were adults in professions at risk of COVID-19. |
| Intervention | Two doses of the AZ-Oxford vaccine (ChAdOx1 nCoV-19) |
| Comparator | Control: meningococcal vaccine or saline |
| Outcome | Virologically confirmed, symptomatic COVID-19 (NAAT-positive swab plus at least one of fever, cough, shortness of breath, anosmia or ageusia) |

Primary endpoint and analysis plan

The primary outcome was virologically confirmed, symptomatic COVID-19, defined as a NAAT-positive swab combined with at least one qualifying symptom (fever \(\ge\) 37.8°C, cough, shortness of breath, or anosmia or ageusia) (103)

… each study had to meet prespecified criteria of having at least five cases eligible for inclusion in the primary outcome before a study was included in efficacy analyses (103)

Vaccine efficacy was calculated as 1 – adjusted relative risk (ChAdOx1 nCoV-19 vs control groups) computed using a Poisson regression model with robust variance (103)

Outcome

There were 30 (0.5%) cases among 5807 participants in the vaccine arm and 101 (1.7%) cases among 5829 participants in the control group, resulting in vaccine efficacy of 70.4% (95.8% CI 54.8–80.6; table 2; figure).
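As a rough check on the reported figure, the crude (unadjusted) efficacy can be computed directly from the case counts quoted above. This is only a sketch: the paper's 70.4% comes from an adjusted Poisson regression model, so the crude value should be close but not identical. The variable names are illustrative and not taken from the paper.

```python
# Case counts quoted from Voysey et al. (2021), primary efficacy analysis
cases_vaccine, n_vaccine = 30, 5807
cases_control, n_control = 101, 5829

# Incidence (risk) of symptomatic COVID-19 in each arm
risk_vaccine = cases_vaccine / n_vaccine
risk_control = cases_control / n_control

# Crude relative risk and vaccine efficacy (1 - RR).
# Unadjusted, unlike the Poisson-regression estimate reported in the paper.
relative_risk = risk_vaccine / risk_control
crude_efficacy = 1 - relative_risk

print(f"Risk, vaccine arm: {risk_vaccine:.2%}")
print(f"Risk, control arm: {risk_control:.2%}")
print(f"Crude vaccine efficacy: {crude_efficacy:.1%}")
```

The crude estimate comes out at roughly 70.2%, within a fraction of a percentage point of the adjusted estimate reported in the paper.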

How would you interpret this result?

Understanding key aspects of the statistical inference

  1. Consider what outcomes we would expect if the vaccine didn’t work and we repeated these studies many times. The assumption that the vaccine doesn’t work is called the null hypothesis, and it informs the statistical model.
  2. Use what we know about the statistical model assuming the null hypothesis is true to determine how large the studies would need to be to reliably identify a clinically important effect (lower bound of CI \(>\) 20%). This is a consideration of study power.
  3. Conduct the experiment. Compare the observed efficacy with what we would expect if the null hypothesis were true.
  4. If the observed efficacy is considerably larger than 20%, then the study provides evidence that the null hypothesis is false.
  5. The probability of observing a result at least as extreme as the one we observed can be calculated from the statistical model, assuming the null hypothesis is true. This is the \(p\) value.
  6. Alternatively, we can use the 95% confidence intervals around the observed efficacy to determine whether the results are statistically (and clinically) significant.
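The steps above can be sketched numerically. The code below is an illustration under stated assumptions, not the paper's analysis: instead of the Poisson regression used by Voysey et al., it uses a simpler conditional binomial model (given the 131 total cases, how many would fall in the vaccine arm?). The one-sided significance level of 0.025 and the "true" efficacy of 60% used for the power calculation are hypothetical values chosen for illustration, and the function names are my own.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def vaccine_share(efficacy, n_vaccine, n_control):
    """Expected share of all cases that fall in the vaccine arm
    when the true efficacy is `efficacy` (relative risk = 1 - efficacy)."""
    rr = 1 - efficacy
    return rr * n_vaccine / (rr * n_vaccine + n_control)

# Counts quoted from Voysey et al. (2021)
cases_vaccine, n_vaccine = 30, 5807
cases_control, n_control = 101, 5829
total_cases = cases_vaccine + cases_control

# p value: probability of 30 or fewer vaccine-arm cases out of 131,
# first if the vaccine does nothing, then if its efficacy is only 20%
p_no_effect = binom_cdf(
    cases_vaccine, total_cases, vaccine_share(0.0, n_vaccine, n_control)
)
p_20_percent = binom_cdf(
    cases_vaccine, total_cases, vaccine_share(0.2, n_vaccine, n_control)
)

# Power: find the largest vaccine-arm case count that would still reject
# "efficacy = 20%", then ask how often a vaccine with the assumed true
# efficacy would produce a count that low (conditional on 131 cases).
alpha = 0.025            # assumption: one-sided significance level
assumed_efficacy = 0.60  # assumption: hypothetical true efficacy
share_null = vaccine_share(0.2, n_vaccine, n_control)
critical_count = max(
    k for k in range(total_cases + 1)
    if binom_cdf(k, total_cases, share_null) <= alpha
)
power = binom_cdf(
    critical_count, total_cases,
    vaccine_share(assumed_efficacy, n_vaccine, n_control),
)

print(f"p value if efficacy is 0%:  {p_no_effect:.2e}")
print(f"p value if efficacy is 20%: {p_20_percent:.2e}")
print(f"Reject 'efficacy = 20%' at {critical_count} or fewer vaccine-arm cases")
print(f"Power if true efficacy is {assumed_efficacy:.0%}: {power:.1%}")
```

Both p values are far below 0.05, which matches the confidence interval in the paper lying well above 20%.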

Appraising Voysey et al. (2021) using the Critical Appraisal Tool

2. Appraise the study methods

  • Was the assignment of patients to treatments randomized?

Yes

  • Were the groups similar at the start of the trial?

Yes—at least on a per-trial level. The participants in each trial differ quite a bit (see Table 1).

  • Aside from the allocated treatment, were groups treated equally?

There were many differences (e.g. timing of the second dose, dosing, follow-up). That said, the methods were pre-approved with regulators.

  • Were all patients who entered the trial accounted for? And were they analysed in the groups to which they were randomized?

See the appendix for the CONSORT participant flow diagram. Participants were analysed according to the vaccines they received (an intention-to-treat analysis was conducted as a sensitivity analysis).

  • Were the measures objective or were the patients and clinicians kept “blind” to which treatment was being received?

Measures seem appropriately objective. The studies differed in terms of masking (single, double blind).

3. Interpret the results

  • How large was the treatment effect?

Efficacy: 70.4% (95.8% CI 54.8–80.6)

  • How precise was the estimate of the treatment effect?

This is a judgment call. Given this is an interim result and the lower bound of the confidence interval is considerably greater than 20%, it seems reasonable to suggest the result is sufficiently precise.
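To see where an interval like this comes from, the sketch below computes an approximate 95.8% confidence interval for the crude efficacy using a normal approximation on the log relative risk. This is a swapped-in textbook method, not the paper's adjusted Poisson regression with robust variance, so the bounds are expected to be close to, but not the same as, the published 54.8–80.6. Variable names are illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist

# Case counts quoted from Voysey et al. (2021)
cases_vaccine, n_vaccine = 30, 5807
cases_control, n_control = 101, 5829
confidence = 0.958  # level reported in the paper for this interim analysis

relative_risk = (cases_vaccine / n_vaccine) / (cases_control / n_control)

# Standard error of log(RR) for two independent proportions
se_log_rr = sqrt(
    1 / cases_vaccine - 1 / n_vaccine + 1 / cases_control - 1 / n_control
)

# Two-sided critical value of the standard normal distribution
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

rr_low = exp(log(relative_risk) - z * se_log_rr)
rr_high = exp(log(relative_risk) + z * se_log_rr)

# Efficacy = 1 - RR, so the upper RR bound gives the lower efficacy bound
efficacy_low, efficacy_high = 1 - rr_high, 1 - rr_low

print(f"Crude efficacy: {1 - relative_risk:.1%}")
print(f"Approximate {confidence:.1%} CI: {efficacy_low:.1%} to {efficacy_high:.1%}")
print("Lower bound above 20%:", efficacy_low > 0.20)
```

The approximate bounds land within about half a percentage point of the published interval, and the lower bound is well above the 20% threshold.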

4. External validity

  • Will the results help me in caring for my patient?
  • Alternatively: How will I apply the results?

Take homes

  • The most commonly used cut-off for \(p\) values in clinical research is 0.05.
  • If the \(p\) value is \(< 0.05\), the result is considered statistically significant.
  • A statistically significant result for the primary endpoint of a trial is more trustworthy than statistically significant results for secondary endpoints or subgroups, because the trial was set up to test the primary endpoint.
  • Once you have determined that the primary endpoint of a trial was statistically significant, the next question is whether the magnitude of the effect is clinically significant.
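One reason secondary endpoints and subgroups deserve more caution is multiplicity: the more tests are run, the more likely it is that at least one crosses the 0.05 threshold by chance alone. The sketch below illustrates this under the simplifying assumption that the tests are independent and every null hypothesis is true. The function name is hypothetical.

```python
ALPHA = 0.05  # conventional significance threshold

def chance_of_false_positive(n_tests, alpha=ALPHA):
    """Probability of at least one p < alpha across n independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests

for n_tests in (1, 5, 10, 20):
    print(f"{n_tests:>2} tests: {chance_of_false_positive(n_tests):.0%}")
```

With a single pre-specified primary endpoint the chance of a false positive stays at 5%. Under these assumptions it rises to roughly 40% across 10 tests.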

Measures of effect

  1. Be able to describe and interpret common measures of effect used in clinical epidemiology

Measuring the effects of interventions

Relative risk

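\[RR = \frac{I_t}{I_c}\]

where \(I_t\) is the incidence of the outcome in the treatment group and \(I_c\) is the incidence in the control group.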

Relative risk reduction (or increase)

\[ RRR = 1 - RR \]

The absolute risk difference (absolute risk reduction or absolute risk increase)

\[ARR = I_c - I_t \]

\[ARI = I_t - I_c\]

The number needed to treat (\(NNT\)) and/or harm (\(NNH\))

\[ NNT = 100/ARR \]

\[ NNH = 100/ARI \]

where \(ARR\) and \(ARI\) are expressed as percentages.

Voysey et al. (2021)

| | Endpoint | No endpoint | Total |
| --- | --- | --- | --- |
| Treatment | 30 | 5777 | 5807 |
| Control | 101 | 5728 | 5829 |

\[ I_t = 30/5807 = 0.005 = 0.5\%\]

\[ I_c = 101/5829 = 0.017 = 1.7\% \]

\[RR = \frac{0.5}{1.7} = 0.294\]

\[ RRR = 0.706\]

\[ARR = 1.7 - 0.5 = 1.2\%\]

\[NNT = \frac{100}{1.2} = 83.333\]
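The same arithmetic can be scripted. The sketch below uses the group sizes from the paper's reported result (5807 vaccine, 5829 control) and keeps full precision rather than rounding the incidences to one decimal place first, so the values differ slightly from the hand calculation above. The function name and dictionary keys are illustrative.

```python
from math import ceil, inf

def measures_of_effect(events_t, total_t, events_c, total_c):
    """Return common measures of effect for a two-arm trial.

    Incidences are kept as proportions, so NNT = 1 / ARR here,
    which is equivalent to 100 / ARR when ARR is a percentage.
    """
    incidence_t = events_t / total_t
    incidence_c = events_c / total_c
    rr = incidence_t / incidence_c
    arr = incidence_c - incidence_t
    return {
        "I_t": incidence_t,
        "I_c": incidence_c,
        "RR": rr,
        "RRR": 1 - rr,
        "ARR": arr,
        "NNT": 1 / arr if arr != 0 else inf,
    }

# Counts from Voysey et al. (2021): 30/5807 vaccine, 101/5829 control
result = measures_of_effect(30, 5807, 101, 5829)

print(f"I_t = {result['I_t']:.2%}, I_c = {result['I_c']:.2%}")
print(f"RR  = {result['RR']:.3f}")
print(f"RRR = {result['RRR']:.3f}")
print(f"ARR = {result['ARR']:.2%}")
print(f"NNT = {result['NNT']:.1f} (round up to {ceil(result['NNT'])})")
```

Without intermediate rounding the NNT is about 82.2, conventionally rounded up to 83, compared with 83.3 when the rounded percentages are used.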

Review objectives

  1. Understand the logic of statistical testing in clinical trials
  2. Demonstrate an understanding of key statistical concepts: hypothesis testing, \(p\) values, confidence intervals and power
  3. Be able to describe and interpret common measures of effect used in clinical epidemiology
  4. Be able to critically appraise a clinical trial

References

Centre for Evidence-Based Medicine. n.d. “Critical Appraisal Tools - CEBM.” Accessed March 5, 2018. https://www.cebm.net/2014/06/critical-appraisal/.

Greenhalgh, T. 1997. “How to Read a Paper: Assessing the Methodological Quality of Published Papers.” BMJ 315 (7103): 305–8. https://doi.org/10.1136/bmj.315.7103.305.

Voysey, Merryn, Sue Ann Costa Clemens, Shabir A. Madhi, Lily Y. Weckx, Pedro M. Folegatti, Parvinder K. Aley, Brian Angus, Vicky L. Baillie, Shaun L. Barnabas, Qasim E. Bhorat, et al. 2021. “Safety and Efficacy of the ChAdOx1 nCoV-19 Vaccine (AZD1222) Against SARS-CoV-2: An Interim Analysis of Four Randomised Controlled Trials in Brazil, South Africa, and the UK.” The Lancet 397 (10269): 99–111. https://doi.org/10.1016/S0140-6736(20)32661-1.